How Clean is Clean Enough? Determining the Most Effective Use of Resources in the Data Cleansing Process
Authors
Abstract
Poor data quality can have a significant impact on system and organizational performance. As data gathering and storage have grown, so has the number of data sources that must be merged in data warehouse and Enterprise Resource Planning (ERP) implementations. This makes data cleansing, as part of the implementation conversion, increasingly difficult. In this research we expand the traditional Extract-Transform-Load (ETL) process to identify subprocesses between the main stages. We then identify the decisions and tradeoffs involved in allocating time and resources and in setting accuracy constraints for the data cleansing process. We develop a mathematical model of the process to identify the optimal configuration of these factors. We use empirical data to test the feasibility of the proposed model. Multiple domain experts validate the range of constraints used for model testing. Three levels of cleansing complexity are tested in the preliminary analysis to demonstrate the use and validity of the modeling process.
Similar resources
Studying indicators of traditional architecture of Shiraz houses in order to provide a suitable model for contemporary housing design in order to use clean energy
Architecture is always influenced by various indicators, the most important of which are climatic and physical-spatial indicators. These indicators are well observed in traditional homes and have played an important role in the use of clean energy. In this study, the aim is to study the climatic and spatial indicators of traditional architecture of Shiraz houses in order to provide a suitable mo...
Clean Hydrogen Energy and Electric Power Production with CO2 Capturing by Using Coal Gasification
Clean hydrogen is the major energy carrier for power production. The conversion of CO to CO2 and zero emission during hydrogen energy production causes high capital cost. It is a matter of prestige to optimize the process in order to make zero emission and cost effective production of clean hydrogen energy and electric power. In this era, coal gasification is th...
Preparation of U3O8 by Calcination from Ammonium Uranyl Carbonate Using Response Surface Methodology: Process Optimization
The parameters to prepare U3O8 by calcination from ammonium uranyl carbonate were optimized by using response surface methodology. A quadratic equation model for the value of total uranium and U4+ of triuranium octaoxide was built and the effects of main factors and their corresponding relationships were obtained. The statistical analysis of the results indi...
Determination of vulnerability of aquifer Ardebil using DRASTIC method in GIS
Background and Objective: Groundwater resources are the most valuable resources of each country. Development of agricultural activities in Ardabil plain and over-use of fertilizers and pesticides, improper disposal of municipal sewage and industrial areas are responsible for groundwater pollution. Clean-up of groundwater resources is very difficult and expensive. One of suitable method in preve...
Determining the Optimal Environmental Tax in a Generalized Growth Model with Clean Technology Transfer and Environmental Quality: The Case of Iran's Economy
This study aims at determining the optimal environmental tax policy in the context of a dynamic model. For this purpose, clean technology diffusion was added to the AK growth model and the theoretical model has been generalized to the open economy. The main feature of the economy is creating pollution in the process of economic growth and its negative impact on social welfare. The diffusion...